Recursive method for solving the inexact greatest common divisor problem

ABSTRACT

A method, system, and computer program product are provided for determining the greatest common divisor (GCD) for a plurality of data points. A plurality of interim solutions is generated from an initial set of at least one data point from the plurality of data points. An iterative algorithm is then performed until the occurrence of a termination event. The iterative algorithm includes selecting a new data point from the plurality of data points. Each of the plurality of interim solutions is updated according to the selected data point so as to provide a set of at least one updated interim solution from each interim solution. Each updated interim solution is evaluated to produce a fitness parameter. An updated interim solution is eliminated when the fitness parameter does not achieve a desired threshold.

BACKGROUND OF THE INVENTION

1. Technical Field

The invention relates generally to data analysis methodologies and, more specifically, to systems and methods for solving the inexact greatest common divisor problem.

2. Description of the Prior Art

The greatest common divisor (GCD) problem was first solved for exact values (e.g., values without random noise) by Euclid as an iterative algorithm around 300 B.C. In Euclid's algorithm, the GCD can be determined by dividing the larger number by the smaller to obtain a remainder value. If the remainder is zero, the GCD is the smaller of the two numbers. If the remainder is non-zero, the problem is repeated for the smaller number and the remainder. This continues through a number of iterations until a remainder of zero is achieved. The GCD is the divisor used to achieve the remainder of zero.
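By way of illustration only, the division procedure described above can be sketched in Python as follows. The function name `euclid_gcd` is hypothetical, and the sketch applies only to exact positive integers.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Exact GCD of two positive integers by repeated division."""
    larger, smaller = max(a, b), min(a, b)
    while smaller != 0:
        # Repeat the problem for the smaller number and the remainder.
        larger, smaller = smaller, larger % smaller
    # The last divisor that produced a zero remainder is the GCD.
    return larger


print(euclid_gcd(1071, 462))  # prints 21
```

Here 1071 divided by 462 leaves 147, 462 divided by 147 leaves 21, and 147 divided by 21 leaves zero, so the GCD is 21.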

The problem is complicated significantly by the introduction of noise into the values. Unfortunately, most real world applications require the capacity to analyze noisy data. Several limited solutions have been found to the inexact GCD problem, but they generally are useful only in certain circumstances, such as small data sets or data sets having only a moderate level of noise. These limitations make existing solutions inefficient for some applications.

SUMMARY OF THE INVENTION

In accordance with one aspect of the present invention, a method is provided for determining the greatest common divisor for a plurality of data points. A plurality of interim solutions is generated from an initial set of at least one data point from the plurality of data points. An iterative algorithm is then performed until the occurrence of a termination event. The iterative algorithm includes selecting a new data point from the plurality of data points. Each of the plurality of interim solutions is updated according to the selected data point so as to provide a set of at least one updated interim solution from each interim solution. Each updated interim solution is evaluated to produce a fitness parameter. An updated interim solution is eliminated when the fitness parameter does not achieve a desired threshold.

In accordance with another aspect of the present invention, a system is provided for determining a greatest common divisor for a plurality of numerical data points. A system memory stores a pool of at least one interim solution. A solution updater updates the pool of interim solutions according to a received data point to produce at least one updated interim solution. A solution evaluator evaluates each updated interim solution and calculates an estimated GCD for each of the plurality of solutions. The solution evaluator eliminates an updated interim solution when the likelihood that the estimated GCD associated with the interim solution is correct falls below a threshold value.

In accordance with yet another aspect of the invention, a computer program product, encoded on a computer readable medium and operative in a computer processor, is provided for determining a greatest common divisor for a plurality of numerical data points. A system memory stores a pool of interim solutions. A solution updater receives a given data point from the plurality of numerical data points and updates the pool of interim solutions according to the received data point. A solution evaluator evaluates each updated interim solution to produce a fitness parameter and eliminates an updated interim solution when the fitness parameter does not meet a desired threshold.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other features of the present invention will become apparent to one skilled in the art to which the present invention relates upon consideration of the following description of the invention with reference to the accompanying drawings, wherein:

FIG. 1 illustrates a methodology for determining the greatest common divisor of a plurality of numerical data points having associated random noise in accordance with one aspect of the present invention.

FIG. 2 illustrates a decision tree representing a plurality of interim solutions to the greatest common divisor problem in accordance with an aspect of the present invention.

FIG. 3 illustrates an exemplary methodology for determining the greatest common divisor of a plurality of measurements containing random error in accordance with an aspect of the present invention.

FIG. 4 illustrates a second exemplary methodology for determining the greatest common divisor of a plurality of measurements containing random error in accordance with an aspect of the present invention.

FIG. 5 illustrates an exemplary system for determining the greatest common divisor of a sequence of data points containing random error in accordance with an aspect of the present invention.

FIG. 6 illustrates a schematic block diagram of an exemplary operating environment for a system configured in accordance with an aspect of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

In accordance with an aspect of the present invention, methods and systems for solving the inexact greatest common divisor problem are provided. The methods and systems can be applied to any of a number of applications in which an efficient, robust solution to the inexact greatest common divisor problem is desirable, such as the detection and location of radar emissions or the harmonic analysis of noisy data.

The present invention can be implemented, at least in part, as one or more software programs. Therefore, the structures described herein may be considered to refer either to individual modules and tasks within a software program or to an equivalent hardware implementation.

FIG. 1 illustrates a methodology 10 for determining the greatest common divisor (GCD) of a plurality of numerical data points having associated random noise in accordance with one aspect of the present invention. The methodology begins at block 12, where a plurality of interim solutions is generated from a set of at least one data point. Each interim solution can comprise a set of at least one integer multiplier associated with a given data point or linear combination of data points. For example, a range of possible values can be known for the GCD according to an associated application. The plurality of interim solutions can be generated by dividing a given data point, or a linear combination of data points, by associated minimum and maximum values from the range of possible values for the GCD.

At block 14, a new data point from the plurality of data points is selected. At block 16, the interim solutions are updated to incorporate another multiplier value based on the selected data point. For example, a range can be calculated from a previous estimate of the GCD and the selected data point, and a set of integers within the range can be determined. A new set of interim solutions can be generated from each existing interim solution, wherein each interim solution in a given set comprises the multiplier values comprising its associated existing interim solution and one of the plurality of integer values.

At block 18, each of the updated interim solutions is evaluated. For example, a regression analysis can be performed using the multiplier values associated with a given interim solution and the corresponding data points. A fitness parameter, such as the sum squared error, can be determined from each regression to determine the fitness of the solution. At block 20, interim solutions having a degree of fitness less than a desired threshold are eliminated from consideration. Eliminated solutions are not updated or evaluated when a new data point is added to the analysis. Accordingly, the processing demands associated with the methodology 10 are decreased relative to a brute force approach.

At decision block 22, it is determined if a termination event has occurred. For example, the termination event can include the achievement of a sufficiently small sum squared error, the elimination of all interim solutions but one, or the use of all available data points. If no termination event has occurred (N), the methodology returns to block 14 to select a new data point and update the remaining interim solutions. If a termination event has occurred (Y), the methodology advances to block 24, where the interim solution having the largest associated GCD estimate is selected.

FIG. 2 illustrates a decision tree 30 representing a plurality of interim solutions to the greatest common divisor problem. The decision tree includes a root node 32 and a plurality of layers of branches 34, 36, 38, and 40. Each layer 34, 36, 38, and 40 of the tree represents one of a plurality of data points used to generate the interim solution. A given node within each layer 34, 36, 38, and 40 represents a possible value for the integer multiplier associated with the data point represented by the layer. In the illustrated tree, the root node 32 does not represent a specific multiplier value; however, in some applications, the root node 32 can represent a default first multiplier value necessary for some data models.

The first layer 34 represents the initial interim solutions. The individual nodes represent a set of possible values for a first multiplier value, $N_1$. In a first iteration of a methodology associated with an aspect of the present invention, the initial interim solutions can be updated to add another layer 36 of possible multiplier values associated with a new data point. Each of the twelve paths from the root node 32 to a terminal node in the second layer 36 represents an interim solution to the GCD problem. The solutions can be evaluated to determine if they represent a likely answer to the problem, with solutions having an associated probability less than a threshold value, $\alpha$, being eliminated from further updating and consideration.

In the third layer 38, a third data point is used to update the remaining interim solutions with a third set of multiplier values, and the updated solutions are evaluated. Again, solutions having associated probability values below the threshold probability, $\alpha$, are eliminated. The fourth layer 40 represents the remaining solutions, updated with an additional set of multiplier values to incorporate a fourth data point. At this stage in the illustrated example, only one interim solution having an associated probability greater than the threshold remains. Accordingly, the remaining solution, comprising the set of multiplier values $[N_{1,4}, N_{2,1}, N_{3,1}, N_{4,1}]$, can be updated with any remaining data points and utilized to calculate the GCD of the data points.

FIG. 3 illustrates an exemplary methodology 50 for determining the greatest common divisor (GCD) of a plurality of measurements containing random error. In the exemplary methodology 50, each of a plurality of data points received by the system is modeled as a multiple of the GCD with no offset and a random measurement error, such that:

$$t_k = N_k T + W_k \qquad \text{(Eq. 1)}$$

where $k$ is an index associated with the data points, $t_k$ is a $k^{th}$ data point, $T$ is the greatest common divisor for the data set, $N_k$ is an integer multiplier associated with the $k^{th}$ data point, and $W_k$ is a random error from a Gaussian distribution having a mean of zero and a known variance $\sigma^2$.

Accordingly, the GCD can be determined as a slope associated with a line represented by the plurality of data points.

At block 52, the index, $k$, is initialized to one. At block 54, all possible values of $N_1$ are determined according to a known maximum value, $T_{max}$, and a known minimum value, $T_{min}$, for $T$ and the data point, $t_1$. For example, a set of possible values for $N_1$ can include all integers between a minimum value, $t_1/T_{max}$, and a maximum value, $t_1/T_{min}$. At this point in the process, the set of possible values for $N_1$ can be conceptualized as a first branching for a decision tree representing a plurality of interim solutions to the greatest common divisor problem. Each interim solution is represented by the multiplier values along one of a plurality of paths from a root of the decision tree to an associated terminal branch. A first estimation of the slope, $\hat{T}_0$, can be determined for each value of $N_1$ as the ratio of the first data point, $t_1$, and $N_1$. A first estimate of the variance of the slope estimate, $\sigma_T^2$, can be determined for each possible value of $N_1$ as the ratio of the variance of the measurement errors, $\sigma^2$, and the square of the multiplier value, $N_1$.
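As a non-limiting sketch, the initialization at blocks 52 and 54 might be expressed in Python as shown below. The function name `initial_solutions`, the dictionary keys, and the choice to round the bounds inward and exclude multipliers below one are assumptions made for illustration and are not specified by the methodology.

```python
import math


def initial_solutions(t1, t_min, t_max, sigma2):
    """Build one interim solution for each feasible integer value of N1."""
    # Integers between t1 / T_max and t1 / T_min (bounds rounded inward).
    n_low = max(1, math.ceil(t1 / t_max))
    n_high = math.floor(t1 / t_min)
    solutions = []
    for n1 in range(n_low, n_high + 1):
        solutions.append({
            "multipliers": [n1],
            "T_hat0": t1 / n1,          # first slope estimate, t1 / N1
            "var_T": sigma2 / n1 ** 2,  # sigma^2 / N1^2
        })
    return solutions


sols = initial_solutions(t1=10.1, t_min=1.8, t_max=2.6, sigma2=0.01)
print([s["multipliers"][0] for s in sols])  # prints [4, 5]
```

In this example, a first data point of 10.1 and a GCD known to lie between 1.8 and 2.6 yield two initial branches, with slope estimates of 2.525 and 2.02, respectively.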

The index $k$ is incremented by one at block 56. The interim solutions are updated at block 58 using a set of all possible values for $N_k$. This can be accomplished by selecting all integer values within a defined range, such that:

$$N_k \in \frac{\hat{T}\,t_k \pm \sqrt{t_k^2\hat{T}^2 - \left(\hat{T}^2 - K\sigma_T^2\right)\left(t_k^2 - K\sigma^2\right)}}{\hat{T}^2 - K\sigma_T^2} \qquad \text{(Eq. 2)}$$

where $k$ is an index associated with the data points, $t_k$ is a $k^{th}$ data point, $N_k$ is an integer multiplier associated with the $k^{th}$ data point, $K$ is the $(1-\alpha)$ quantile of a chi-squared distribution, corresponding to a desired confidence value $(1-\alpha)$, $\sigma^2$ is the variance associated with the measurement error, $\hat{T}$ is the most current estimate of the slope of a line represented by the data points, and $\sigma_T^2$ is the most current estimated variance associated with the slope estimate.
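For illustration, the range of Eq. 2 can be evaluated as in the following Python sketch. The function name `multiplier_range_eq2` is hypothetical, and the handling of a non-positive denominator (raising an error) is an assumption, since the methodology does not state what is done when the slope estimate is too uncertain to bound the range.

```python
import math


def multiplier_range_eq2(t_k, T_hat, var_T, sigma2, K):
    """Return the integers lying between the two roots given by Eq. 2."""
    a = T_hat ** 2 - K * var_T
    if a <= 0:
        # Assumption: the range is treated as undefined in this case.
        raise ValueError("slope estimate too uncertain for a bounded range")
    # Term under the square root; it is positive whenever a > 0 and K > 0.
    disc = (t_k * T_hat) ** 2 - a * (t_k ** 2 - K * sigma2)
    root = math.sqrt(disc)
    low = (T_hat * t_k - root) / a
    high = (T_hat * t_k + root) / a
    return list(range(math.ceil(low), math.floor(high) + 1))


print(multiplier_range_eq2(t_k=14.1, T_hat=2.0, var_T=0.0004,
                           sigma2=0.01, K=3.84))  # prints [7]
```

An empty list indicates that no integer multiplier is consistent with the branch at the chosen confidence, so the branch produces no updated interim solutions.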

For large data sets, it can be assumed that the estimated variance of the slope, $\sigma_T^2$, is roughly equal to the actual variance of the measurement errors, $\sigma^2$, and the calculation of the range simplifies to:

$$N_k \in \frac{t_k \pm \sigma\sqrt{K}}{\hat{T}} \qquad \text{(Eq. 3)}$$
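The simplified range of Eq. 3 lends itself to a very short sketch. In the Python example below, the function name `multiplier_range_eq3` is hypothetical, and `sigma` denotes the standard deviation of the measurement error (the square root of $\sigma^2$).

```python
import math


def multiplier_range_eq3(t_k, T_hat, sigma, K):
    """Return the integers within (t_k +/- sigma * sqrt(K)) / T_hat."""
    half_width = sigma * math.sqrt(K)
    low = (t_k - half_width) / T_hat
    high = (t_k + half_width) / T_hat
    return list(range(math.ceil(low), math.floor(high) + 1))


print(multiplier_range_eq3(t_k=14.1, T_hat=2.0, sigma=0.1, K=3.84))  # prints [7]
print(multiplier_range_eq3(t_k=13.0, T_hat=2.0, sigma=0.1, K=3.84))  # prints []
```

The second call illustrates a data point that falls roughly midway between two multiples of the current slope estimate, for which no integer multiplier lies within the range.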

From the determined range, a new set of terminal branches associated with the possible values for the current multiplier value, $N_k$, can be appended onto the remaining branches of the decision tree. At block 60, an interim solution is selected from the available interim solutions.

At block 62, regression parameters can be calculated for the plurality of data points to estimate the GCD or a value associated with the GCD. The regression analysis can utilize the $k$ ordered pairs, $(N_k, t_k)$, formed by the multipliers and data points that have been incorporated into the interim solutions. In an exemplary implementation, a correction value, $T'$, representing a correction for a previous estimated slope is determined to avoid large summed squared values that could reduce numerical precision of the calculation. The regression parameters can be calculated as:

$$\begin{aligned}
S_{xx} &= \sum_{i=1}^{k} N_i^2, & S_{xy} &= \sum_{i=1}^{k} N_i\,\Delta t_i, & S_{yy} &= \sum_{i=1}^{k} \Delta t_i^2, \\
\Delta t_i &= t_i - N_i\hat{T}_0, & T' &= \frac{S_{xy}}{S_{xx}}, & \sigma_T^2 &= \frac{\sigma^2}{S_{xx}}, \\
kE^2 &= S_{yy} - S_{xy}T'
\end{aligned} \qquad \text{(Eq. 4)}$$

where $k$ is an index associated with the data points, $t_k$ is a $k^{th}$ data point, $N_k$ is an integer multiplier associated with the $k^{th}$ data point, $kE^2$ is a summed squared error associated with the plurality of data points, $\sigma^2$ is the variance associated with the measurement error, $\hat{T}_0$ is a first estimate of a slope of a line represented by the data points, $T'$ is a correction value that yields a present estimate, $\hat{T}$, of the slope, such that $\hat{T} = T' + \hat{T}_0$, and $\sigma_T^2$ is an estimated variance associated with the slope estimate.

The slope correction, $T'$, determined from the regression model allows an estimated GCD value to be determined for the interim solution, and the summed squared error, $kE^2$, provides an indication of the confidence in the solution.
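A minimal Python sketch of the Eq. 4 statistics is given below for illustration. The function name `regress_no_offset` and the returned dictionary keys are hypothetical, and the sums are recomputed from scratch rather than updated iteratively.

```python
def regress_no_offset(multipliers, data, T_hat0, sigma2):
    """Compute the Eq. 4 statistics for one interim solution."""
    # Residuals relative to the first slope estimate keep the sums small.
    dt = [t - n * T_hat0 for n, t in zip(multipliers, data)]
    s_xx = sum(n * n for n in multipliers)
    s_xy = sum(n * d for n, d in zip(multipliers, dt))
    s_yy = sum(d * d for d in dt)
    t_prime = s_xy / s_xx  # slope correction T'
    return {
        "T_hat": T_hat0 + t_prime,     # present slope (GCD) estimate
        "var_T": sigma2 / s_xx,        # variance of the slope estimate
        "kE2": s_yy - s_xy * t_prime,  # summed squared error
    }


result = regress_no_offset([5, 7, 12], [10.1, 13.9, 24.2],
                           T_hat0=10.1 / 5, sigma2=0.01)
print(round(result["T_hat"], 4))  # prints 2.0101
```

The slope estimate obtained through the correction value equals the ordinary least-squares slope through the origin, here 438.2/218.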

At decision block 64, it is determined if the selected interim solution represents a likely solution for the GCD. A test value equal to the ratio of the summed squared error, $kE^2$, to the estimated variance of the measurement errors, $\sigma^2$, can be determined and compared to a chi-square distribution with $(k-1)$ degrees of freedom. If the test value is determined to lie outside of a desired confidence interval within the chi-square distribution (N), the interim solution is eliminated from consideration at block 66. In terms of the decision tree model, the terminal branch associated with the interim solution is removed, such that no further updates are applied to the branch. The methodology then advances to decision block 68.
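The test at decision block 64 might be sketched in Python as follows. Several assumptions are made here for illustration: the function names are hypothetical, the test is treated as one-sided (a solution is rejected only when the test value is too large), a solution with a single data point is always retained, and the chi-square quantile is approximated with the Wilson-Hilferty formula so that only the standard library is needed. An exact quantile function could be substituted.

```python
from statistics import NormalDist


def chi2_quantile_approx(p, dof):
    """Wilson-Hilferty approximation to the p-quantile of chi-square(dof)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


def is_plausible(kE2, sigma2, k, alpha=0.05):
    """Keep a solution whose test value kE2 / sigma2 is within the limit."""
    if k < 2:
        return True  # assumption: no test is possible with zero degrees of freedom
    return kE2 / sigma2 <= chi2_quantile_approx(1.0 - alpha, k - 1)


print(is_plausible(kE2=0.02, sigma2=0.01, k=3))  # prints True
print(is_plausible(kE2=0.20, sigma2=0.01, k=3))  # prints False
```

With three data points and two degrees of freedom, the approximate 95% limit is about 5.9, so a test value of 2 is retained and a test value of 20 is eliminated.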

If it is determined at block 64 that the test value lies within the desired confidence interval of the chi-square distribution (Y), the methodology advances directly to decision block 68. At block 68, it is determined if all of the interim solutions have been evaluated. If not (N), the methodology returns to block 60 to select a new interim solution for evaluation. If all of the interim solutions have been evaluated (Y), the methodology advances to decision block 70, where it is determined if a termination event has occurred. For example, the termination event can include the achievement of a sufficiently small sum squared error, the elimination of all interim solutions but one, or the use of all available data points. If no termination event has occurred (N), the methodology returns to block 56 to increment the index, $k$, and update the remaining interim solutions in light of the new data point. If a termination event has occurred (Y), the remaining interim solution having the largest associated slope value, $T$, is selected at block 62 to provide the GCD for the model, and the methodology terminates.
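The flow of methodology 50 as a whole can be sketched in Python as shown below. This is an illustrative sketch under stated assumptions rather than a definitive implementation: all function names are hypothetical, the only termination events modeled are the use of all data points and an empty pool, the regression is recomputed directly instead of through the correction value of Eq. 4, the chi-square limit uses the Wilson-Hilferty approximation, a branch whose Eq. 2 denominator is non-positive is dropped, and `K` defaults to 3.84 (the 0.95 quantile for one degree of freedom).

```python
import math
from statistics import NormalDist


def chi2_quantile_approx(p, dof):
    """Wilson-Hilferty approximation to the chi-square quantile (assumption)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * math.sqrt(c)) ** 3


def fit(ns, ts):
    """Least-squares slope through the origin, S_xx, and summed squared error."""
    s_xx = sum(n * n for n in ns)
    slope = sum(n * t for n, t in zip(ns, ts)) / s_xx
    sse = sum((t - n * slope) ** 2 for n, t in zip(ns, ts))
    return slope, s_xx, sse


def candidate_multipliers(t_k, slope, var_T, sigma2, K):
    """Integers inside the Eq. 2 range (empty when the range is undefined)."""
    a = slope ** 2 - K * var_T
    if a <= 0:
        return []
    disc = max((t_k * slope) ** 2 - a * (t_k ** 2 - K * sigma2), 0.0)
    root = math.sqrt(disc)
    low = math.ceil((slope * t_k - root) / a)
    high = math.floor((slope * t_k + root) / a)
    return [n for n in range(low, high + 1) if n >= 1]


def inexact_gcd(data, t_min, t_max, sigma, alpha=0.05, K=3.84):
    """Return (GCD estimate, multipliers), or None if every branch is pruned."""
    sigma2 = sigma ** 2
    t1 = data[0]
    first = range(max(1, math.ceil(t1 / t_max)), math.floor(t1 / t_min) + 1)
    pool = [[n1] for n1 in first]           # blocks 52-54
    for k in range(2, len(data) + 1):       # block 56
        if not pool:
            return None
        t_k = data[k - 1]
        limit = chi2_quantile_approx(1.0 - alpha, k - 1)
        updated = []
        for ns in pool:                     # blocks 58-68
            slope, s_xx, _ = fit(ns, data[:k - 1])
            for n_k in candidate_multipliers(t_k, slope, sigma2 / s_xx, sigma2, K):
                cand = ns + [n_k]
                _, _, sse = fit(cand, data[:k])
                if sse / sigma2 <= limit:   # block 64
                    updated.append(cand)
        pool = updated
    if not pool:
        return None
    # Select the surviving solution with the largest slope estimate.
    best = max(pool, key=lambda ns: fit(ns, data)[0])
    return fit(best, data)[0], best


estimate = inexact_gcd([10.05, 13.95, 24.1, 37.95, 62.05],
                       t_min=1.8, t_max=2.6, sigma=0.1)
print(estimate)
```

In this example the data are noisy multiples of 2.0. The branch beginning with a multiplier of four is pruned at the second data point because no integer falls within its range, leaving a single solution with a slope estimate close to 2.0.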

FIG. 4 illustrates a second exemplary methodology 100 for determining the greatest common divisor (GCD) of a plurality of measurements containing random error. In the exemplary methodology 100, each of a plurality of data points received by the system is modeled as a multiple of the GCD with an offset value that is constant across the plurality of data points and a random measurement error, such that:

$$t_k = N_k T + T_d + W_k \qquad \text{(Eq. 5)}$$

where $k$ is an index associated with the data points, $t_k$ is a $k^{th}$ data point, $T$ is the greatest common divisor for the data set, $T_d$ is a constant offset value, $N_k$ is an integer multiplier associated with the $k^{th}$ data point, and $W_k$ is a random error from a Gaussian distribution having a mean of zero and a known variance $\sigma^2$.

Accordingly, the GCD and the offset can be determined, respectively, as slope and intercept values associated with a line represented by the plurality of data points.

At block 102, the index, $k$, is initialized to two, and a first multiplier value, $N_1$, is initialized to zero. At block 104, all possible values of $N_2$ are determined according to a known maximum value, $T_{max}$, and a known minimum value, $T_{min}$, for $T$ and the first two data points, $t_1$ and $t_2$. For example, a set of possible values for $N_2$ can include all of the integers in a range defined as follows:

$$N_2 \in \left[\frac{t_2 - t_1}{T_{max}},\ \frac{t_2 - t_1}{T_{min}}\right],\quad N_2 \neq 0 \qquad \text{(Eq. 6)}$$

At this point in the process, the set of possible values for $N_2$ can be conceptualized as a first branching for a decision tree representing a plurality of interim solutions to the greatest common divisor problem. Each interim solution is represented by the multiplier values along one of a plurality of paths from a root of the decision tree to an associated terminal branch. The value of the first data point, $t_1$, can be utilized as a first estimation of the offset, $\hat{T}_{d0}$, for all values of $N_2$. The variance of the measurement error, $\sigma^2$, can be used as a first estimate, $\sigma_{T_{d0}}^2$, of the variance of the offset estimate. A first estimation of the GCD, and accordingly the slope of the line represented by the data points, can be determined for each possible value of $N_2$ as:

$$\hat{T}_0 = \frac{t_2 - t_1}{N_2} \qquad \text{(Eq. 7)}$$

A first estimate of the variance, $\sigma_{T0}^2$, of the slope estimate can be determined for each possible value of $N_2$ as:

$$\sigma_{T0}^2 = \frac{2\sigma^2}{N_2} \qquad \text{(Eq. 8)}$$
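The initialization of Eqs. 6 through 8 can be sketched in Python as shown below. The function name and dictionary keys are hypothetical. The sketch assumes that the data points are in increasing order, so that $t_2 > t_1$ and every candidate $N_2$ is a positive integer; the methodology itself requires only that $N_2$ be non-zero.

```python
import math


def initial_solutions_with_offset(t1, t2, t_min, t_max, sigma2):
    """Build one interim solution per feasible N2, with N1 fixed at zero."""
    diff = t2 - t1
    if diff <= 0:
        # Assumption of this sketch: data points arrive in increasing order.
        raise ValueError("this sketch assumes t2 > t1")
    n_low = math.ceil(diff / t_max)    # Eq. 6 lower bound, rounded inward
    n_high = math.floor(diff / t_min)  # Eq. 6 upper bound, rounded inward
    solutions = []
    for n2 in range(n_low, n_high + 1):
        solutions.append({
            "multipliers": [0, n2],
            "Td_hat0": t1,                # first offset estimate
            "var_Td0": sigma2,            # first offset variance estimate
            "T_hat0": diff / n2,          # Eq. 7
            "var_T0": 2.0 * sigma2 / n2,  # Eq. 8 as written above
        })
    return solutions


sols = initial_solutions_with_offset(t1=3.2, t2=13.1, t_min=1.8,
                                     t_max=2.6, sigma2=0.01)
print([s["multipliers"] for s in sols])  # prints [[0, 4], [0, 5]]
```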

The index $k$ is incremented by one at block 106. The interim solutions are updated at block 108 using a set of all possible values for $N_k$. This can be accomplished by selecting all integer values within a defined range, such that:

$$N_k \in \frac{\hat{T}\left(t_k - \hat{T}_d\right) \pm \sqrt{\left(t_k - \hat{T}_d\right)^2\hat{T}^2 - \left(\hat{T}^2 - K\sigma_T^2\right)\left[\left(t_k - \hat{T}_d\right)^2 - K\left(\sigma^2 - \sigma_{T_d}^2\right)\right]}}{\hat{T}^2 - K\sigma_T^2} \qquad \text{(Eq. 9)}$$

where $k$ is an index associated with the data points, $t_k$ is a $k^{th}$ data point, $N_k$ is an integer multiplier associated with the $k^{th}$ data point, $K$ is the $(1-\alpha)$ quantile of a chi-squared distribution, corresponding to a desired confidence value $(1-\alpha)$, $\sigma^2$ is the variance associated with the measurement error, $\hat{T}$ is an estimate of a slope of a line represented by the data points, $\sigma_T^2$ is an estimated variance associated with the slope estimate, $\hat{T}_d$ is an estimate of an offset value (e.g., y-intercept) of a line represented by the data points, and $\sigma_{T_d}^2$ is an estimated variance associated with the y-intercept estimate.

For large data sets, it can be assumed that the variance of the slope, $\sigma_T^2$, and the variance of the offset, $\sigma_{T_d}^2$, are roughly equal to the actual variance of the measurement errors, $\sigma^2$, and the calculation of the range simplifies to:

$$N_k \in \frac{\left(t_k - \hat{T}_d\right) \pm \sigma\sqrt{K}}{\hat{T}} \qquad \text{(Eq. 10)}$$
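As a brief illustration, the simplified range of Eq. 10 might be computed as in the Python sketch below. The function name `multiplier_range_eq10` is hypothetical, and `sigma` denotes the standard deviation of the measurement error.

```python
import math


def multiplier_range_eq10(t_k, T_hat, Td_hat, sigma, K):
    """Return the integers within ((t_k - Td_hat) +/- sigma * sqrt(K)) / T_hat."""
    centre = t_k - Td_hat  # remove the estimated offset first
    half_width = sigma * math.sqrt(K)
    low = (centre - half_width) / T_hat
    high = (centre + half_width) / T_hat
    return list(range(math.ceil(low), math.floor(high) + 1))


print(multiplier_range_eq10(t_k=27.3, T_hat=2.0, Td_hat=3.2,
                            sigma=0.1, K=3.84))  # prints [12]
```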

From the determined range, a new set of terminal branches associated with the possible values for the current multiplier value, $N_k$, can be appended onto the remaining branches of the decision tree. At block 110, an interim solution is selected from the available interim solutions.

At block 112, regression parameters can be calculated for the plurality of data points to estimate the GCD, or a value associated with the GCD, and the offset value. The regression analysis can utilize the $k$ ordered pairs, $(N_k, t_k)$, formed by the multipliers and data points that have been incorporated into the interim solutions. In an exemplary implementation, correction values, $T'$ and $T_d'$, representing corrections for previously estimated slope and offset values, are determined to avoid large summed squared values that could reduce numerical precision of the calculation.

The regression parameters can be calculated as:

$$\begin{aligned}
S_x &= \sum_{i=1}^{k} N_i, & S_y &= \sum_{i=1}^{k} \Delta t_i, \\
S_{xx} &= \sum_{i=1}^{k} N_i^2, & S_{xy} &= \sum_{i=1}^{k} N_i\,\Delta t_i, & S_{yy} &= \sum_{i=1}^{k} \Delta t_i^2, \\
\Delta t_i &= t_i - N_i\hat{T}_0 - \hat{T}_{d0}, & D &= kS_{xx} - \left(S_x\right)^2, \\
T' &= \frac{kS_{xy} - S_xS_y}{D}, & T_d' &= \frac{S_{xx}S_y - S_xS_{xy}}{D}, \\
\sigma_T^2 &= \frac{S_{xx}\sigma^2}{D}, & \sigma_{T_d}^2 &= \frac{k\sigma^2}{D}, \\
kE^2 &= S_{yy} - S_{xy}\left(T'_{k-1} - T'_k\right) - S_y\left[\left(T_d'\right)_{k-1} - \left(T_d'\right)_k\right]
\end{aligned} \qquad \text{(Eq. 11)}$$

where $k$ is an index associated with the data points, $t_k$ is a $k^{th}$ data point, $N_k$ is an integer multiplier associated with the $k^{th}$ data point, $kE^2$ is a summed squared error associated with the plurality of data points, $\sigma^2$ is the variance associated with the measurement error, $\hat{T}_0$ is a first estimate of a slope of a line represented by the data points, $\hat{T}_{d0}$ is a first estimate of an offset value (e.g., y-intercept) for a line represented by the data points, $T'$ is a correction value that yields a present estimate, $\hat{T}$, of the slope, such that $\hat{T} = T' + \hat{T}_0$, $T_d'$ is a correction value that yields a present estimate, $\hat{T}_d$, of the offset value, such that $\hat{T}_d = T_d' + \hat{T}_{d0}$, $\sigma_T^2$ is an estimated variance associated with the slope value estimate, and $\sigma_{T_d}^2$ is an estimated variance associated with the offset value estimate.

The slope correction, $T'$, determined from the regression model allows an estimated GCD value to be determined for the interim solution, and the summed squared error, $kE^2$, provides an indication of the confidence in the solution. It will be appreciated that the statistics described above are well-suited to iterative updating, such that the processing demands of the methodology are reduced.
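The slope and offset corrections of Eq. 11 can be illustrated with the following Python sketch. The function name `regress_with_offset` is hypothetical. For simplicity, and as an assumption of this sketch, the summed squared error is computed directly from the residuals of the fitted line rather than through the recursive expression for $kE^2$, and the variance estimates are not computed.

```python
def regress_with_offset(multipliers, data, T_hat0, Td_hat0):
    """Return (slope estimate, offset estimate, summed squared error)."""
    k = len(data)
    # Residuals relative to the first slope and offset estimates.
    dt = [t - n * T_hat0 - Td_hat0 for n, t in zip(multipliers, data)]
    s_x = sum(multipliers)
    s_y = sum(dt)
    s_xx = sum(n * n for n in multipliers)
    s_xy = sum(n * d for n, d in zip(multipliers, dt))
    d = k * s_xx - s_x ** 2
    t_prime = (k * s_xy - s_x * s_y) / d      # slope correction T'
    td_prime = (s_xx * s_y - s_x * s_xy) / d  # offset correction T_d'
    T_hat = T_hat0 + t_prime
    Td_hat = Td_hat0 + td_prime
    sse = sum((t - n * T_hat - Td_hat) ** 2
              for n, t in zip(multipliers, data))
    return T_hat, Td_hat, sse


# Noise-free data generated from T = 2.0 and T_d = 3.0.
T_hat, Td_hat, sse = regress_with_offset([0, 5, 12, 20], [3.0, 13.0, 27.0, 43.0],
                                         T_hat0=2.1, Td_hat0=2.9)
print(round(T_hat, 6), round(Td_hat, 6))  # prints 2.0 3.0
```

Starting from deliberately inexact first estimates of 2.1 and 2.9, the corrections recover the slope and offset used to generate the data.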

At decision block 114, it is determined if the selected interim solution represents a likely solution for the GCD and the constant offset. A test value equal to the ratio of the summed squared error, $kE^2$, for a given solution to the estimated variance of the measurement errors, $\sigma^2$, can be determined and compared to a chi-square distribution with $(k-1)$ degrees of freedom. If the test value is determined to lie outside of a desired confidence interval within the chi-square distribution (N), the interim solution is eliminated from consideration at block 116. In terms of the decision tree model, the terminal branch associated with the interim solution is removed, and no further updates are applied to the branch. The methodology then advances to decision block 118.

If it is determined at block 114 that the test value lies within the desired confidence interval of the chi-square distribution (Y), the methodology advances directly to decision block 118. At block 118, it is determined if all of the interim solutions have been evaluated. If not (N), the methodology returns to block 110 to select a new interim solution for evaluation. If all of the interim solutions have been evaluated (Y), the methodology advances to decision block 120, where it is determined if a termination event has occurred. For example, the termination event can include the achievement of a sufficiently small sum squared error, the elimination of all interim solutions but one, or the use of all available data points. If no termination event has occurred (N), the methodology returns to block 106 to increment the index, $k$, and update the remaining interim solutions in light of the new data point. If a termination event has occurred (Y), the remaining interim solution having the largest associated slope value, $T$, is selected at block 112 to provide the GCD and offset values for the model, and the methodology terminates.

FIG. 5 illustrates an exemplary system 200 for determining the greatest common divisor (GCD) of a sequence of data points containing random error in accordance with an aspect of the present invention. The system attempts to find respective integer multipliers for the plurality of data points and determine the common divisor used to generate the data points. In one implementation, a common offset value can also be determined for the plurality of data points, such that a given data point can be represented as:

$$t_k = N_k T + T_d + W_k \qquad \text{(Eq. 12)}$$

where $k$ is an index associated with the data points, $t_k$ is a $k^{th}$ data point, $T$ is the greatest common divisor for the data set, $T_d$ is a constant offset value, $N_k$ is an integer multiplier associated with the $k^{th}$ data point, and $W_k$ is a random error from a Gaussian distribution having a mean of zero and a known variance $\sigma^2$.

The exemplary system 200 can be utilized in any situation in which it is useful to determine a GCD of a plurality of values that incorporate random noise. For example, the system 200 could be incorporated into a system for performing harmonic analysis on noisy data. In an exemplary implementation, the system 200 can be used in a Doppler emitter geolocation system to analyze a plurality of pulse times of arrival and calculate the period of a base clock used to generate the pulses.

It will be appreciated that the illustrated system 200 can be implemented as one or more computer programs, executable on one or more general purpose data processors. Accordingly, any structures herein described can be implemented alternately as dedicated hardware circuitry for the described function or as program code stored as part of a computer-accessible memory, such as a computer hard drive, random access memory, or a removable disk medium (e.g., magnetic storage media, flash media, CD and DVD media, etc.). Functions carried out by the illustrated system, but not helpful in understanding the claimed invention, are omitted from this diagram.

The system 200 includes a memory 202 that stores an interim solution pool 203 that comprises a plurality of interim solutions to a given problem. Each interim solution includes a set of at least one integer multiplier value, with each multiplier value corresponding to one of the plurality of data points. An estimate of the GCD can be determined, for example, via a regression analysis or, in the case of a single value, simple division.

For each new data point provided to the system, the interim solutions can be updated at a solution updater 204. The solution updater 204 uses the new data point to provide a set of one or more new interim solutions from each existing interim solution. The solution updater 204 can determine, for a given existing interim solution, a plurality of possible multiplier values associated with the new data point. For example, the possible multiplier values can include all integers within a range defined by the data point and the estimate of the GCD associated with the existing interim solution. The set of new interim solutions for each existing interim solution includes a new interim solution incorporating each possible multiplier value.

The updated interim solutions are provided to a solution evaluator 206. The solution evaluator 206 evaluates each of the updated interim solutions to determine if it is likely that the interim solution represents an accurate representation of the data points and their associated GCD. For example, the multiplier values comprising the interim solution and the data points associated with the multiplier values can be subjected to a regression analysis to determine how well they fit an associated data model. An estimate of the GCD can be calculated for each updated interim solution according to the results of this analysis. Solutions that are determined to fit the model poorly, such that they have a low probability of providing an accurate GCD, can be eliminated from the pool of interim solutions 203. Accordingly, the pool of interim solutions 203 can be narrowed with each new data point to reduce the computational demands on the system 200. When all of the data points have been considered or the pool of interim solutions has been reduced to a single solution, a GCD can be calculated from one of the remaining solutions.

FIG. 6 illustrates a computer system 300 that can be employed to implement systems and methods described herein, such as based on computer executable instructions running on the computer system. The computer system 300 can be implemented on one or more general purpose networked computer systems, embedded computer systems, routers, switches, server devices, client devices, various intermediate devices/nodes and/or stand alone computer systems. Additionally, the computer system 300 can be implemented as part of a computer-aided engineering (CAE) tool running computer executable instructions to perform a method as described herein.

The computer system 300 includes a processor 302 and a system memory 304. A system bus 306 couples various system components, including the system memory 304, to the processor 302. Dual microprocessors and other multi-processor architectures can also be utilized as the processor 302. The system bus 306 can be implemented as any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The system memory 304 includes read only memory (ROM) 308 and random access memory (RAM) 310. A basic input/output system (BIOS) 312 can reside in the ROM 308, generally containing the basic routines that help to transfer information between elements within the computer system 300, such as during a reset or power-up.

The computer system 300 can include a hard disk drive 314, a magnetic disk drive 316, e.g., to read from or write to a removable disk 318, and an optical disk drive 320, e.g., for reading a CD-ROM or DVD disk 322 or to read from or write to other optical media. The hard disk drive 314, magnetic disk drive 316, and optical disk drive 320 are connected to the system bus 306 by a hard disk drive interface 324, a magnetic disk drive interface 326, and an optical drive interface 334, respectively. The drives and their associated computer-readable media provide nonvolatile storage of data, data structures, and computer-executable instructions for the computer system 300. Although the description of computer-readable media above refers to a hard disk, a removable magnetic disk and a CD, other types of media which are readable by a computer, may also be used. For example, computer executable instructions for implementing systems and methods described herein may also be stored in magnetic cassettes, flash memory cards, digital video disks and the like.

A number of program modules may also be stored in one or more of the drives as well as in the RAM 310, including an operating system 330, one or more application programs 332, other program modules 334, and program data 336.

A user may enter commands and information into the computer system 300 through user input device 340, such as a keyboard or a pointing device (e.g., a mouse). Other input devices may include a microphone, a joystick, a game pad, a scanner, a touch screen, or the like. These and other input devices are often connected to the processor 302 through a corresponding interface or bus 342 that is coupled to the system bus 306. Such input devices can alternatively be connected to the system bus 306 by other interfaces, such as a parallel port, a serial port or a universal serial bus (USB). One or more output device(s) 344, such as a visual display device or printer, can also be connected to the system bus 306 via an interface or adapter 346.

The computer system 300 may operate in a networked environment using logical connections 348 to one or more remote computers 350. The remote computer 350 may be a workstation, a computer system, a router, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer system 300. The logical connections 348 can include a local area network (LAN) and a wide area network (WAN).

When used in a LAN networking environment, the computer system 300 can be connected to a local network through a network interface 352. When used in a WAN networking environment, the computer system 300 can include a modem (not shown), or can be connected to a communications server via a LAN. In a networked environment, application programs 332 and program data 336 depicted relative to the computer system 300, or portions thereof, may be stored in memory 354 of the remote computer 350.

1. A method for determining the greatest common divisor (GCD) for a plurality of data points comprising: generating a plurality of interim solutions from an initial set of at least one data point from the plurality of data points; and iteratively performing the following steps until the occurrence of a termination event: selecting a new data point from the plurality of data points; updating each of the plurality of interim solutions according to the selected data point, as to provide a set of at least one updated interim solution from each interim solution; evaluating each updated interim solution to produce a fitness parameter; and eliminating an updated interim solution when the fitness parameter does not achieve a desired threshold.
 2. A method as set forth in claim 1, wherein the step of generating a plurality of interim solutions from an initial set of at least one data point includes generating an interim solution corresponding to each integer within a defined range.
 3. A method as set forth in claim 1, wherein the step of updating each of the plurality of solutions comprises the steps of: selecting a current interim solution from the plurality of interim solutions, the current interim solution comprising at least one associated multiplier value; defining a range of possible multiplier values for the selected data point according to a previous estimate of the GCD and the selected data point; identifying at least one integer within the defined range; and generating an updated interim solution for each identified integer, a given updated interim solution comprising the at least one multiplier value associated with the current interim solution and an identified integer.
 4. A method as set forth in claim 1, wherein the step of evaluating each updated interim solution comprises conducting a regression analysis on a set of multiplier values comprising a given interim solution and the plurality of data points.
 5. A method as set forth in claim 4, wherein the fitness parameter comprises a sum squared error parameter calculated as part of the regression analysis.
 6. A method as set forth in claim 1, wherein the step of eliminating an updated interim solution when the fitness parameter does not achieve a desired threshold comprises computing a test value from the fitness parameter and comparing the test value to a chi-squared distribution.
 7. A system for determining a greatest common divisor (GCD) for a plurality of numerical data points comprising: a system memory that stores a pool of at least one interim solution; a solution updater that updates the pool of interim solutions according to a received data point to produce at least one updated interim solution; and a solution evaluator that evaluates each updated interim solution, calculates an estimated GCD for each of the plurality of solutions, and eliminates an updated interim solution when the likelihood that the estimated GCD associated with the interim solution is correct falls below a threshold value.
 8. A system as set forth in claim 7, the solution updater being operative to retrieve an interim solution from the system memory, calculate at least one integer multiplier value from the received data point and a previous estimate of the GCD associated with the retrieved solution, and produce a corresponding set of at least one updated interim solution from the retrieved interim solution.
 9. A system as set forth in claim 8, wherein a given updated interim solution comprises an associated integer multiplier value from the calculated at least one integer multiplier value and a set of at least one multiplier value associated with the retrieved interim solution.
 10. A system as set forth in claim 7, the solution evaluator being operative to determine a fitness parameter for a given updated interim solution, representing the likelihood that the estimated GCD associated with the interim solution is the correct GCD for the plurality of numerical data points.
 11. A system as set forth in claim 7, wherein the solution evaluator is operative to perform a regression analysis on the plurality of data points and a set of multiplier values comprising a given interim solution.
 12. An emission geolocation system comprising the system of claim 7.
 13. A computer program product, encoded on a computer readable medium and operative in a computer processor, for determining a greatest common divisor (GCD) for a plurality of numerical data points comprising: a system memory that stores a pool of interim solutions; a solution updater that receives a given data point from the plurality of numerical data points and updates the pool of interim solutions according to the received data point; and a solution evaluator that evaluates each updated interim solution to produce a fitness parameter and eliminates an updated interim solution when the fitness parameter does not meet a desired threshold.
 14. A Doppler emitter geolocation system comprising the computer program product of claim 13.
 15. A computer program product as set forth in claim 13, the solution updater being operative to retrieve an interim solution from the system memory, calculate at least one integer multiplier value from the received data point and produce a corresponding set of at least one updated interim solution from the retrieved interim solution.
 16. A computer program product as set forth in claim 15, wherein each of the updated interim solutions comprises an associated integer multiplier value from the calculated at least one integer multiplier value and a set of at least one multiplier value associated with the retrieved interim solution.
 17. A computer program product as set forth in claim 13, wherein the solution evaluator is operative to perform a regression analysis on the plurality of data points and a set of multiplier values comprising a given interim solution.
 18. A computer program product as set forth in claim 17, wherein the solution evaluator is operative to iteratively update a plurality of regression parameters associated with the regression analysis each time a numerical data point from the plurality of numerical data points is received.
 19. A computer program product as set forth in claim 13, wherein the solution evaluator is operative to compute a test value associated with an updated interim solution and eliminate the updated interim solution if the test value falls outside a confidence interval associated with a chi-squared distribution.
 20. A computer program product as set forth in claim 13, wherein the solution evaluator is operative to calculate an estimated GCD for each updated interim solution. 